In vivo immobilization of proteins

ABSTRACT

Provided is a novel system for immobilizing recombinantly produced proteins by entrapping them in crystals of co-expressed proteins that are capable of self-crystallization. Related compositions, as well as methods of making and using the immobilized proteins, are also described.

RELATED APPLICATIONS

This application is a 371 National Stage Entry of PCT Patent Application No. PCT/CN2020/086652, filed Apr. 24, 2020, which claims priority to U.S. Provisional Pat. Application No. 62/839,400, filed Apr. 26, 2019, the contents of which are hereby incorporated by reference in the entirety for all purposes.

REFERENCE TO SUBMISSION OF A SEQUENCE LISTING AS A TEXT FILE

The Sequence Listing written in file 080015-1258878-027710US_SL.txt created on Sep. 20, 2022, 111,479 bytes, machine format IBM-PC, MS-Windows operating system, is hereby incorporated by reference in its entirety for all purposes.

BACKGROUND OF THE INVENTION

In many medical and industrial contexts the use of recombinant proteins is becoming increasingly important. The production, isolation, use, and potential reuse of recombinant proteins, especially enzymes, remain in need of improvement for better quality, efficiency, and stability. For instance, enzyme immobilization can facilitate easy removal and subsequent reuse of enzymes during multiple rounds of catalysis. In many cases, immobilization also improves the stability of enzymes against many industrial conditions such as high temperatures and organic solvents. Immobilization by entrapment is particularly attractive since it does not involve any modifications to the enzyme structure, increasing the chance for high activity retention and native enzyme conformation. Thus, there exists a need for new and effective means for producing recombinant proteins, such as enzymes, in immobilized form. The present invention fulfills this and other related needs.

BRIEF SUMMARY OF THE INVENTION

This invention provides a novel approach to improve the physical properties and stability of recombinant proteins by immobilizing recombinant proteins such as enzymes useful in various industrial applications. Thus, in a first aspect, this invention provides a method for recombinantly co-expressing a protein of interest with a crystal-forming protein. The method includes these steps: (1) providing bacterial cells comprising an expression cassette encoding the protein of interest and an expression cassette encoding a Cry protein, a crystal-forming fragment of the Cry protein, or a fusion protein capable of forming crystals comprising the Cry protein or the crystal-forming fragment thereof; and (2) culturing the bacterial cells under conditions permissible for the expression of the protein of interest as well as the Cry protein, the crystal-forming fragment thereof, or the fusion protein, wherein the Cry protein, the crystal-forming fragment thereof, or the fusion protein forms a crystal containing the protein of interest upon both being expressed in the bacterial cells.

In some embodiments, the protein of interest is an enzyme, such as a lipase (e.g., Proteus mirabilis lipase, or PML, including a PML variant with modifications of residues 118 and 130, for example, I118V + E130G, and lipA or lipAR9), ligase, hydrolase, esterase, protease, or glycosidase. In some embodiments, the Cry protein is Cry3Aa. In some embodiments, the crystal-forming fragment is the N-terminal 290 amino acids of Cry3Aa, or the N-terminal 625 or 626 amino acids of Cry3Aa, or the 498-644 fragment of Cry3Aa. In some embodiments, the expression cassette encoding the protein of interest and the expression cassette encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein are one and the same expression cassette. In some embodiments, the one single expression cassette comprises (1) one copy of polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein, and (2) one copy or two or more copies of polynucleotide sequence encoding the protein of interest. In some embodiments, (1) the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein, and (2) the polynucleotide sequence encoding the protein of interest are operably linked to one single promoter. In some embodiments, the one single promoter is operably linked to (1) one copy of the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein, followed by (2) one copy of the polynucleotide sequence encoding the protein of interest, with one ribosome binding site between (1) and (2). In some embodiments, the one single promoter is operably linked to (1) one copy of the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein, followed by (2) two or more copies of the polynucleotide sequence encoding the protein of interest, with one ribosome binding site between (1) and (2) and between two copies of polynucleotide sequence encoding the protein of interest.

In some embodiments, (1) the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein; and (2) the polynucleotide sequence encoding the protein of interest are operably linked to two separate promoters. In some embodiments, the two separate promoters are two different kinds of promoters, for example, cyt1Aa promoter and cry3Aa promoter. In some embodiments, (1) the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein; and (2) the polynucleotide sequence encoding the protein of interest share one single termination codon, resulting in one copy of the Cry protein, crystal-forming fragment thereof, or the fusion protein and two copies of the protein of interest.

In some embodiments, the expression cassette encoding the protein of interest and the expression cassette encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein are two separate expression cassettes. In some embodiments, the fusion protein comprises the Cry protein or crystal-forming fragment thereof and one or more heterologous polypeptides (such as a lipase lipA or SEQ ID NO:8 in 1-3 repeats) at the N- and/or C-terminus. In some embodiments, the fusion protein is Cry3Aa-[SmtA]₁₋₃. In some embodiments, two or more proteins of interest are co-expressed with the Cry protein, the crystal-forming fragment thereof, or the fusion protein and are contained within the crystal formed by the Cry protein, the crystal-forming fragment thereof, or the fusion protein. In some embodiments, the bacterial cells are Bacillus subtilis (Bs), Bacillus thuringiensis (Bt), or E. coli cells. In some embodiments, the method of this invention further includes a step, prior to step (1), of introducing into the bacterial cells the expression cassette encoding the protein of interest and the expression cassette encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein. In some embodiments, more than one protein of interest, e.g., two or more proteins, are recombinantly co-expressed with the Cry protein, crystal-forming fragment thereof, or the fusion protein. These proteins of interest may be the same protein (e.g., both are PML) or different proteins (e.g., one lipase and one ligase). In some embodiments, the method of this invention further includes a step, after step (2), of isolating the crystal formed by the Cry protein, the crystal-forming fragment thereof, or the fusion protein and entrapping the protein or proteins of interest, which may be more than one protein, e.g., two or more proteins. In some embodiments, after it is isolated, the crystal containing the protein(s) of interest is washed under appropriate conditions (e.g., at an appropriate salt concentration) to permit the protein(s) entrapped within the crystal to be released from the crystal, preferably without dissolving the crystal to any substantial degree. In some embodiments, after being isolated, the crystal containing the protein(s) of interest is dissolved to release the protein(s) entrapped within the crystal. In some embodiments, the protein is an enzyme, such as PML. In some embodiments, the protein of interest is a fluorescent protein such as mCherry.

In a second aspect, the present invention provides a protein crystal produced by the method described above and herein for co-expression of one or more recombinant proteins of interest with a crystal-forming protein, such as a Cry protein, a crystal-forming fragment of the Cry protein, or a fusion protein capable of forming crystals comprising the Cry protein or the crystal-forming fragment thereof. In some embodiments, the protein of interest is an enzyme, such as a lipase (e.g., Proteus mirabilis lipase, or PML, including a PML variant with modifications of residues 118 and 130, for example, I118V + E130G, and lipA or lipAR9), ligase, hydrolase, esterase, protease, or glycosidase. In some embodiments, the protein of interest is a fluorescent protein such as mCherry.

In a third aspect, the present invention provides a method for performing a reaction. The method comprises the step of incubating the protein crystal produced by the method of this invention entrapping an enzyme therein with a substrate to the enzyme under conditions permissible for the substrate to be catalyzed by the enzyme. In some embodiments, the enzyme is a lipase (e.g., Proteus mirabilis lipase, or PML, including a PML variant with modifications of residues 118 and 130, for example, I118V + E130G, and lipA or lipAR9), ligase, hydrolase, esterase, protease, or glycosidase. In some embodiments, the method further comprises a step, after the reaction is completed, of removing the reaction product and cleaning, e.g., washing the crystal to remove any detectable amount of reaction mixture including reaction agents and/or product(s), preferably without dissolving the crystal to any substantial degree, and then reusing the protein crystal in a second reaction.

In a fourth aspect, the present invention provides an in vitro method for co-crystallizing (1) a Cry protein, a crystal-forming fragment of the Cry protein, or a fusion protein capable of forming crystals comprising the Cry protein or the crystal-forming fragment thereof with (2) one or more proteins (e.g., enzymes) by mixing a soluble protein described in (1) with the protein or proteins of (2), thus allowing the protein or proteins of (2) to be entrapped within a protein crystal formed by the crystal-forming protein of (1). In some embodiments, the protein is an enzyme. In some embodiments, the enzyme-entrapped protein crystal so formed is used for performing a reaction where the protein crystal is incubated with a substrate to the enzyme under conditions permissible for the substrate to be catalyzed by the enzyme. In some embodiments, the protein crystal containing entrapped protein(s) of interest formed by the methods described above and herein is used for delivering the protein(s) to cells, such as macrophages, lymphocytes, cancer cells, red blood cells, epithelial cells, stem cells, and liver cells.

In a fifth aspect, the present invention provides various modified Cry proteins, their fragments or fusion proteins, all of which still retain the crystallizing capability (for example, Cry3Aa*, Cry3Aa*-lipA, Cry3Aa*-SpyCat, Cry3Aa-[SmtA]₁₋₃, NegCry3Aa, 3A2-2, and Cry3Aa (S145C, H161R)), as well as various modified proteins with retained enzymatic activities (such as PML, PML^(VG), LmSP, LipA, and LipAR9), a polynucleotide sequence encoding each of such proteins, a nucleic acid comprising the polynucleotide sequence encoding each of the proteins, especially an expression cassette comprising the polynucleotide coding sequence operably linked to a promoter (e.g., a heterologous promoter from an origin different from that of the wild-type base protein) or a vector comprising such an expression cassette, and a host cell comprising such a vector or expression cassette, which is able to express the protein under permissible culture conditions. Also provided are methods for recombinantly producing any of these proteins by culturing the host cells under conditions permissible for the recombinant protein expression and optionally further concentrating or isolating/purifying the proteins.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. (a) Electron micrograph of Bt highlighting its crystal inclusion. Electron micrographs of purified (b) Cry3Aa crystals, and (c) Cry3Aa-GFP fusion crystals (reproduced from reference 15).

FIG. 2. Recyclability of Cry3Aa*-lipA for the conversion of coconut oil to biodiesel.¹⁶

FIG. 3. Schematic illustrating the potential mode of entrapment of PML within Cry3Aa crystals (modified from reference 16).

FIG. 4. Plasmid variations constructed for optimization of PML entrapment in Cry3Aa crystals. 3P- cry3Aa promoter; RBS- ribosome binding site.

FIG. 5. SDS-PAGE analysis of PML^(VG) entrapped in Cry3Aa crystals by different coexpression methods. (a) Analysis of Cry3Aa[PML^(VG)] and 3PPML^(VG)[Cyt13A] crystals and (b) Cry3Aa[PML^(VG)]² and 3P3A[3PPML^(VG)] crystals. Lane (M) molecular weight marker (kDa).

FIG. 6. Scanning electron micrographs of Cry3Aa and 3P3A3P[PML^(VG)] crystals.

FIG. 7. p-Nitrophenyl palmitate (pNPP) hydrolysis activity of PML^(VG) entrapped in Cry3Aa crystals by different coexpression methods.

FIG. 8A. Recyclability of optimized Cry3Aa[PML^(VG)] crystals during the conversion of waste cooking oil to biodiesel. Cycles were 10 h each. FIG. 8B. Thermal stability of free PML^(VG) and 3P3A[3PPML^(VG)] crystals. Samples were incubated in triplicate at various temperatures for 1 h and then tested for residual activity. Error bars show the standard deviation of the mean.

FIG. 9. GCMS analysis of benzyl laurate synthesis reaction. (a) Elution profile of no enzyme control shows no benzyl laurate peak. (b) Elution profile of lipase crystals reaction to produce benzyl laurate. (c) Mass-spec of benzyl laurate peak confirms product formation.

FIG. 10. In vitro co-crystallization of Cry3Aa and carboxy rhodamine labelled PML by vapor diffusion. Crystals with red fluorescence but no UV fluorescence are crystals comprised of PML only. Crystals with UV fluorescence but not red fluorescence are Cry3Aa crystals that do not contain PML. Entrapment of PML inside Cry3Aa crystals is confirmed by finding crystals with both red and UV fluorescence.

FIG. 11. Fluorescence microscopy of Cry3Aa co-entrapment of GFP and mCherry fluorescent proteins. The data demonstrate that GFP and mCherry proteins are entrapped in single Cry3Aa crystals.

FIG. 12. Production and analysis of LmSP entrapment in Cry3Aa crystals. (a) SDS-PAGE analysis of 3P3A3P[LmSP] and 3P3A3P[LmSP][PML^(VG)] crystals. (b) Sucrose phosphorylase activity and (c) lipase activity of coentrapped crystals.

FIG. 13. Production and characterization of Cry3Aa*[PML^(VG)]² crystals. (a) SDS-PAGE gel of purified Cry3Aa*[PML^(VG)]² crystals. Lane (1) molecular weight marker (kDa), Lane (2) crystals. (b) pNPP hydrolysis activity of Cry3Aa*[PML^(VG)]² crystals.

FIG. 14. Production and activity of the lipase:laccase catalyst. (a) SDS-PAGE gel showing conjugation of CotA-Spytag to Cry3Aa*-SpyCat[PML^(VG)] crystals at different ratios. Lane (M) molecular weight marker (kDa). (b) 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulphonic acid) (ABTS) oxidation by free CotA-Spytag and Cry3Aa*-SpyCat[PML^(VG)] crystals.

FIG. 15. Production and characterization of Cry3Aa*-lipA[PML^(VG)] crystals. (a) SDS-PAGE analysis of purified Cry3Aa*-lipA[PML^(VG)] crystals. Lane (1) Pure crystals; lane (M) molecular weight marker (kDa). (b) Time-point study of waste cooking oil conversion to biodiesel.

FIG. 16A. Scheme 1: Lipase catalyzed transesterification of triacylglycerols with methanol (MeOH) to produce biodiesel. Scheme 2: Lipase catalyzed esterification of benzyl alcohol and lauric acid in neat isooctane to produce the cosmetic compound benzyl laurate.

FIG. 16B. Scheme 3: Reaction scheme for the one-pot synthesis of biodiesel and αGG using PML lipase and LmSP sucrose phosphorylase. Scheme 4: Reaction scheme for one-pot production of azelaic acid using PML lipase and CotA-laccase.

FIG. 17. Amino acid sequence alignment between Cry3Aa (SEQ ID NO:4) and NegCry3Aa (SEQ ID NO:6).

FIG. 18. Purified negCry3Aa-RBS-scGFP protein crystals. (A) NegCry3Aa-RBS-scGFP crystal pellets purified from Bt cells grown for 48 h (left) and 72 h (right) were imaged using BioRad ChemiDoc MP System at the GFP excitation wavelength (ex=488 nm). (B) SDS-PAGE gel showing the corresponding purified NegCry3Aa-RBS-scGFP crystals in (A).

FIG. 19. Entrapment of lipAR9 in negCry3Aa crystals. (a) SDS-PAGE analysis of Cry3Aa[lipAR9] and negCry3Aa[lipAR9] crystals. (b) Quantitation of loading of lipAR9 in Cry3Aa and negCry3Aa crystals.

FIG. 20. SDS-PAGE of Cry3Aa-[SmtA]₁ produced in Bt and purified by sucrose density gradient centrifugation. Lane 1: BioRad Precision plus molecular weight marker; lane 2-3: Extracts from <40% sucrose layers; lane 4: crystal extracts from the interface of 40%/60% sucrose layer.

FIG. 21. Structure of Cry3Aa crystal channel. Residues highlighted in red are regions exposed to the solvent channel and can be deleted to expand the channel size.

FIG. 22. Cry3Aa protein sequence. Residues in bold are regions exposed to the solvent channel and can be deleted to expand the channel size. (SEQ ID NO: 4).

FIG. 23. Amino acids in specific regions exposed to the Cry3Aa (SEQ ID NO:4) crystal channel.

FIG. 24. Entrapment of LmSP in 3A2-2 crystal. (a) Model of LmSP homolog Thermoanaerobacter thermosaccharolyticum 6F-phosphate phosphorylase (PDB ID: 6S9V) inside the Cry3Aa crystal (PDB ID: 1DLC) channel. Loop regions that are oriented in the channel with potential steric clash with LmSP are colored red. LmSP is colored yellow with its amino acids colored in black and portrayed as lines. Cry3Aa is colored blue. Model was produced in COOT and images were prepared in PyMol. (b) SDS-PAGE analysis after entrapment of LmSP in the wild-type Cry3Aa crystal (3A[LmSP]) and in the mutant 3A2-2 crystal (3A2-2[LmSP]).

FIG. 25. Stabilizing mutations in Cry3Aa protein. (a) Structure of two Cry3Aa monomers showing His161-E168 electrostatic interaction. (b) Solubilization resistance of Cry3Aa mutants. (c) Structure of two Cry3Aa monomers showing potential site for disulfide formation. (d) Recyclability of PML^(VG) triple copy constructs 3A[PML^(VG)]³ and double mutant 3ADM[PML^(VG)]³ during the synthesis of biodiesel from waste cooking oil.

FIG. 26. Catch and release of lipAR9 by Cry3Aa crystals. (a) p-Nitrophenyl acetate hydrolysis activity of purified 3A[lipAR9] crystals. (b) Lipase activity of supernatant after release of lipAR9 protein. 3A[lipAR9] crystals were incubated with increasing concentrations of NaCl for 1 h and then removed from the mixture. The increase in lipase activity in the supernatant indicates that lipA was released as a function of NaCl concentration.

FIG. 27. Amino acid sequence alignment of NegCry3Aa (SEQ ID NO: 6) against other wild-type Cry proteins (SEQ ID NOs: 13-20, respectively).

DEFINITIONS

The term “Cry protein,” as used herein, refers to any one protein among a class of crystalline proteins produced by strains of Bacillus thuringiensis (Bt). Some examples of “Cry proteins” include, but are not limited to, Cry1Aa, Cry1Ab, Cry2Aa, Cry3Aa, Cry4Aa, Cry4Ba, Cry11Aa, Cry11Ba, and Cry19Aa. Their amino acid sequences and polynucleotide coding sequences are known and can be found in publications such as U.S. Pat. Application published as US2010/0322977. Their GenBank Accession Nos. are:

-   Cry1Aa AY197341.1
-   Cry1Ab AY847289.1
-   Cry2Aa AF273218.1
-   Cry3Aa AJ237900.1
-   Cry4Aa AB513706.1
-   Cry4Ba AB161456.1
-   Cry11Aa AL731825.1
-   Cry11Ba LC153032.1
-   Cry19Aa Y07603.1

In addition to the wild-type Cry proteins, the term “Cry protein” also encompasses functional variants, which (1) share an amino acid sequence identity of at least 80%, 81%, 82%, 83%, 84%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% to the polypeptide sequence of any one of the Cry proteins listed in US2010/0322977; and (2) retain the ability to spontaneously form crystals within host cells as can be confirmed by known methods such as electron microscopy (see description in, e.g., Park et al., Appl Environ Microbiol, 1998, 64, 3932-3938; Schnepf et al., Microbiol Mol Biol Rev, 1998, 62, 775-806; Whiteley and Schnepf, Annu Rev Microbiol, 1986, 40, 549-576; and Nair et al., PLoS One, 2015, 10, e0127669). For example, a “Cry protein” encompasses any variant that confers increased negative charges to the resultant protein by substitutions of at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more positively charged or neutral amino acids within the domain II of the wild-type Cry protein (e.g., the 295 to 499 segment of SEQ ID NO:4, as well as its corresponding segment in other Cry proteins as shown in FIG. 27) with negatively charged amino acids such as aspartate (D) and glutamate (E). One example of such a mutant is NegCry3Aa, SEQ ID NO:6, which is a Cry3Aa variant containing the following mutations: K384E, N391D, N395D, S425E, Q430E, TQ436437EE, KR442443EE, T461D, and K467E. Further encompassed by the term “Cry protein” are modified Cry proteins containing amino acid insertions or deletions (for example, insertion of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids, or a deletion of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acids) at one or more of the 8 specific regions (for example, at least 1, 2, 3, 4, 5, 6, 7, or all 8 regions) shown in FIG. 22, as well as their corresponding regions in other Cry proteins as shown in Figure 18 of PCT/CN2020/084939. Such insertions or deletions can effectively reduce or enlarge the protein’s channel size, respectively, and can therefore enable the resultant crystal-forming protein variant to entrap a target protein (such as an enzyme) more effectively, with increased retention rate and reduced diffusion rate for the target protein. Typically, such a channel-size altering Cry protein mutant contains at least 3, 4, or 5 but no more than 10, 12, or 15 point mutations (deletions or insertions) in the entire protein. One example of a Cry3Aa mutant of this nature is termed 3A2-2 (SEQ ID NO:7). In addition, the term “Cry protein” also encompasses Cry protein mutants that have been modified at one or more residues to improve the stability of the resultant protein or its crystal structure, for example, introducing one or more cysteine residues by insertion or substitution to the wild-type Cry protein amino acid sequence so as to form additional disulfide bond or bonds to stabilize the protein or crystal structure. One such example is a double mutant of Cry3Aa (3AS145C, H161R), having the amino acid sequence of SEQ ID NO: 9.
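As an illustration of the charge-altering variant described above, the following minimal Python sketch tallies the formal charge shift produced by the NegCry3Aa substitutions. It is not part of the disclosed method. It assumes that "TQ436437EE" denotes T436E plus Q437E and that "KR442443EE" denotes K442E plus R443E, and it uses a simplified charge model (K and R as +1, D and E as -1, all other residues, including histidine, as neutral). The names `CHARGE`, `NEG_CRY3AA_MUTATIONS`, `net_charge_shift`, and `all_within` are hypothetical.

```python
# Simplified formal side-chain charges near neutral pH (assumption: His is neutral).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

# NegCry3Aa substitutions as (wild-type residue, position, new residue).
# Assumption: the paired notations are expanded into single-residue substitutions.
NEG_CRY3AA_MUTATIONS = [
    ("K", 384, "E"), ("N", 391, "D"), ("N", 395, "D"), ("S", 425, "E"),
    ("Q", 430, "E"), ("T", 436, "E"), ("Q", 437, "E"), ("K", 442, "E"),
    ("R", 443, "E"), ("T", 461, "D"), ("K", 467, "E"),
]


def net_charge_shift(mutations):
    """Sum of (new charge - old charge) over all substitutions."""
    return sum(CHARGE.get(new, 0) - CHARGE.get(old, 0) for old, _, new in mutations)


def all_within(mutations, start, end):
    """True if every substituted position lies in the inclusive segment."""
    return all(start <= pos <= end for _, pos, _ in mutations)


if __name__ == "__main__":
    print(len(NEG_CRY3AA_MUTATIONS))                       # number of substituted residues
    print(net_charge_shift(NEG_CRY3AA_MUTATIONS))          # formal charge shift
    print(all_within(NEG_CRY3AA_MUTATIONS, 295, 499))      # inside the domain II segment?
```

Under these assumptions the listed mutations substitute 11 residues, all within the 295 to 499 segment, and shift the formal charge by -15.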

Similarly, a “crystal-forming fragment” of a Cry protein is a fragment of any of the known Cry proteins (i.e., less than full length of the wild-type Cry protein) that still retains the ability of self-crystallization, which is demonstrated both by crystallization by the fragment alone and by causing a fusion protein to self-crystallize when the fragment is present in the fusion protein with another protein of interest (e.g., an enzyme). In addition to being a truncated form of a Cry protein, a “crystal-forming fragment” may further contain one or more modifications to the native amino acid sequence such as insertions, deletions, or substitutions, especially conservative modifications, such that the resultant “crystal-forming fragment” shares an amino acid sequence identity of at least 80%, 81%, 82%, 83%, 84%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% to the polypeptide sequence of the corresponding fragment of a wild-type Cry protein. Exemplary crystal-forming fragments of a Cry protein have been described in earlier disclosures, e.g., WO2018/028371.

The term “nucleic acid” or “polynucleotide” refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.

The term “gene” means the segment of DNA involved in producing a polypeptide chain. It may include regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons).

The term “amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. “Amino acid mimetics” refers to chemical compounds having a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.

There are various known methods in the art that permit the incorporation of an unnatural amino acid derivative or analog into a polypeptide chain in a site-specific manner, see, e.g., WO 02/086075.

Amino acids may be referred to herein by either the commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.

“Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, “conservatively modified variants” refers to those nucleic acids that encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein that encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid that encodes a polypeptide is implicit in each described sequence.
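The notion of a silent variation can be illustrated with a short Python sketch that builds the standard genetic code and lists the synonymous codons for a given amino acid. This is a minimal illustration only; the function names `synonymous_codons` and `is_silent` are hypothetical, and the table is the standard code in the conventional T, C, A, G ordering (codons are stored in DNA spelling, with an option to return the RNA spelling).

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid, rna=False):
    """Return all codons encoding the one-letter amino acid, sorted."""
    codons = sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)
    return [c.replace("T", "U") for c in codons] if rna else codons


def is_silent(codon_a, codon_b):
    """True if swapping one codon for the other leaves the amino acid unchanged."""
    a = codon_a.upper().replace("U", "T")
    b = codon_b.upper().replace("U", "T")
    return CODON_TABLE[a] == CODON_TABLE[b]


if __name__ == "__main__":
    print(synonymous_codons("A", rna=True))  # the four alanine codons
    print(synonymous_codons("M"))            # single codon for methionine
    print(synonymous_codons("W"))            # single codon for tryptophan
    print(is_silent("GCA", "GCU"))
```

For alanine the sketch returns GCA, GCC, GCG, and GCU, while methionine and tryptophan each return a single codon, matching the exceptions noted in the text.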

As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.

The following eight groups each contain amino acids that are conservative substitutions for one another:

-   1) Alanine (A), Glycine (G);
-   2) Aspartic acid (D), Glutamic acid (E);
-   3) Asparagine (N), Glutamine (Q);
-   4) Arginine (R), Lysine (K);
-   5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V);
-   6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W);
-   7) Serine (S), Threonine (T); and
-   8) Cysteine (C), Methionine (M)

(see, e.g., Creighton, Proteins, W. H. Freeman and Co., N. Y. (1984)).
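The eight groups listed above can be encoded directly as a lookup, as in the following minimal Python sketch. It is an illustration only; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical. Note that methionine appears in two groups (5 and 8), so membership is tested group by group rather than by assigning each residue a single class.

```python
# The eight conservative substitution groups, in one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(original, replacement):
    """True if the two distinct residues share at least one group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    print(is_conservative("D", "E"))  # same group (2)
    print(is_conservative("M", "C"))  # same group (8)
    print(is_conservative("C", "L"))  # C and L share no group
    print(is_conservative("K", "E"))  # charge reversal, not conservative
```

Because the groups overlap at methionine, the relation is not transitive: M pairs with both C and L, yet C and L are not conservative substitutions for one another under this table.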

In the present application, amino acid residues are numbered according to their relative positions from the left most residue, which is numbered 1, in an unmodified wild-type polypeptide sequence.

As used herein, the terms “identical” or percent “identity,” in the context of describing two or more polynucleotide or amino acid sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (for example, a Cry protein or a crystal-forming fragment of a Cry protein sequence comprised in the fusion protein of this invention has at least 80% identity, preferably 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity, to a reference sequence, e.g., the amino acid sequence of a corresponding wild-type Cry protein or fragment), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences are then said to be “substantially identical.” With regard to polynucleotide sequences, this definition also refers to the complement of a test sequence. Preferably, the identity exists over a region that is at least about 50 amino acids or nucleotides in length, or more preferably over a region that is 75-100 amino acids or nucleotides in length.
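For illustration, percent identity over an already aligned pair of sequences can be computed as in the minimal Python sketch below. It assumes the two sequences have been aligned beforehand (for example, by one of the algorithms discussed in this section), that gaps are written as "-", and that identity is expressed relative to the number of alignment columns; other denominators (such as the shorter sequence length) are also in use, so this convention is an assumption. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned strings of equal length. Columns where
    both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))  # 9 of 10 columns match
    print(percent_identity("AC-DE", "ACGDE"))            # 4 of 5 columns match
    print(meets_threshold("AC-DE", "ACGDE", 80.0))
```

A gapped column counts against identity in this sketch, which makes the measure conservative for sequences that differ by insertions or deletions.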

For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. For sequence comparison of nucleic acids and proteins, the BLAST and BLAST 2.0 algorithms and the default parameters discussed below are used.

A “comparison window”, as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat’l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection, see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds. 1995 supplement).

Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available at the National Center for Biotechnology Information website, ncbi.nlm.nih.gov. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word size (W) of 28, an expectation (E) of 10, M=1, N=-2, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word size (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
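The three halting conditions for word-hit extension described above can be sketched as follows. This is a simplified, ungapped, one-directional illustration for nucleotide sequences and is not the BLAST implementation itself; the function name `extend_right`, its parameters, and the default X value of 3 are assumptions made for the example, while M=1 and N=-2 follow the BLASTN defaults quoted in the text.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 match: int = 1, mismatch: int = -2,
                 x_drop: int = 3, seed_score: int = 0):
    """Extend an ungapped hit to the right of (q_start, s_start).

    Returns (best_score, best_length), where best_length is the number of
    extended positions at which the maximum cumulative score was reached.
    """
    score = seed_score
    best_score = seed_score
    best_len = 0
    i = 0
    # Halting condition 3: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += match if same else mismatch
        i += 1
        if score > best_score:
            best_score = score
            best_len = i
        # Halting condition 2: cumulative score goes to zero or below.
        if score <= 0:
            break
        # Halting condition 1: score falls off by X from its maximum.
        if best_score - score >= x_drop:
            break
    return best_score, best_len
```

A full implementation would also extend to the left of the seed and, for amino acid sequences, would take substitution scores from a matrix such as BLOSUM62 rather than from fixed match and mismatch values.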

The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat’l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.

An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below. Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequence.

“Polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. All three terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical mimetics of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.

The term “recombinant” when used with reference, e.g., to a cell, or a nucleic acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, underexpressed or not expressed at all.

An “expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular polynucleotide sequence in a host cell. An expression cassette may be part of a plasmid, viral vector derived from a viral genome, or nucleic acid fragment/construct. Typically, an expression cassette includes a polynucleotide to be transcribed, operably linked to a promoter. Other elements that may be present in an expression cassette include those that enhance transcription (e.g., enhancers) and terminate transcription (e.g., terminators), as well as those that confer certain binding affinity or antigenicity to the recombinant protein produced from the expression cassette.

A “promoter” is defined as an array of nucleic acid control sequences that direct transcription of a polynucleotide sequence. As used herein, a promoter includes necessary polynucleotide sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription. A “constitutive” promoter is a promoter that is active under most environmental and developmental conditions. An “inducible” promoter is a promoter that is active under environmental or developmental regulation. The term “operably linked” refers to a functional linkage between a polynucleotide expression control sequence (such as a promoter, or array of transcription factor binding sites) and a second polynucleotide sequence, wherein the expression control sequence directs transcription of the polynucleotide sequence corresponding to the second sequence.

The term “heterologous” as used in the context of describing the relative location of two elements, refers to the two elements such as polynucleotide sequences (e.g., a promoter or a protein/polypeptide-encoding sequence) or polypeptide sequences (e.g., a Cry protein sequence or another polypeptide sequence) that are not naturally found in the same relative positions. Thus, a “heterologous promoter” of a gene refers to a promoter that is not naturally operably linked to that gene. Similarly, a “heterologous polypeptide” or “heterologous polynucleotide” to a Cry protein or its encoding sequence or a fragment thereof is one derived from an origin other than the Cry protein or, in the case of a fragment of a Cry protein/coding sequence, may be derived from another part of the same Cry protein or coding sequence, but not naturally connected to the fragment in the same fashion. The fusion of a fragment of a Cry protein (or its coding sequence) with a heterologous polypeptide (or polynucleotide sequence) does not result in a longer polypeptide or polynucleotide sequence that can be found naturally in the wild-type Cry protein.

By “host cell” is meant a cell that contains an expression vector and supports the replication or expression of one or more coding sequences harbored in the expression vector. Host cells may be prokaryotic cells such as Bacillus thuringiensis (Bt), Bacillus subtilis (Bs), or E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells such as CHO, HeLa and the like, e.g., cultured cells, explants, and cells in vivo.

The term “about” as used herein denotes a range of +/- 10% of a reference value. For example, “about 10” defines a range of 9 to 11.
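The arithmetic behind this definition can be expressed as a short Python sketch. The helper names `about_range` and `is_about` are hypothetical and are introduced here only to illustrate the +/- 10% rule stated above.

```python
def about_range(value: float, tolerance: float = 0.10):
    """Return the (low, high) bounds denoted by "about value" (+/- 10%)."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def is_about(x: float, value: float, tolerance: float = 0.10) -> bool:
    """True if x lies within the "about value" range, bounds included."""
    low, high = about_range(value, tolerance)
    return low <= x <= high
```

Under this reading, "about 50 amino acids" would cover 45 to 55 amino acids.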

DETAILED DESCRIPTION OF THE INVENTION

I. Introduction

There has been growing interest in using enzymes to catalyze industrial reactions due to their high reactivity, excellent regio- and enantiospecificity, and low environmental toxicity. In order to financially compete with chemical catalysis, biocatalysts are optimized so they can be recycled multiple times. Additionally, biocatalysts are generally optimized so they can withstand high concentrations of organic solvents - conditions that can promote substrate solubility and enzyme activity. Enzyme immobilization can facilitate easy removal and subsequent reuse of enzymes during multiple rounds of catalysis, in addition to improving enzyme stability under harsh industrial conditions. Immobilization by entrapment is particularly attractive since it does not involve any modifications to the enzyme structure, increasing the chance for high activity retention and native enzyme conformation. In this disclosure, a novel one-step method is described for producing entrapped recombinant proteins (especially enzymes, including multi-enzymes) in protein crystals.

To date, all entrapment methods involve first producing the enzyme and carrier separately, and then mixing them to generate the immobilized enzyme. This multi-step process requires purifying and concentrating the enzyme, synthesizing the carrier, and then mixing them, usually including the use of a catalyst to initiate the polymerization of the carrier. This procedure is tedious and expensive. A method that generates an entrapped enzyme or multi-enzyme system in a one-step process can significantly reduce production costs and lead to cheaper commercial catalysts. The method of this invention leads to the entrapped enzyme in one step, avoiding the need for purification of the free enzyme or polymerization of the carrier. Minimizing time, processes, and materials for producing enzyme catalysts allows for greener and more cost-effective products.

II. Production of Recombinant Proteins

A. General Recombinant Technology

Basic texts disclosing general methods and techniques in the field of recombinant genetics include Sambrook and Russell, Molecular Cloning, A Laboratory Manual (3rd ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Ausubel et al., eds., Current Protocols in Molecular Biology (1994).

For nucleic acids, sizes are given in either kilobases (kb) or base pairs (bp). These are estimates derived from agarose or acrylamide gel electrophoresis, from sequenced nucleic acids, or from published DNA sequences. For proteins, sizes are given in kilodaltons (kDa) or amino acid residue numbers. Protein sizes are estimated from gel electrophoresis, from sequenced proteins, from derived amino acid sequences, or from published protein sequences.

Oligonucleotides that are not commercially available can be chemically synthesized, e.g., according to the solid phase phosphoramidite triester method first described by Beaucage & Caruthers, Tetrahedron Lett. 22: 1859-1862 (1981), using an automated synthesizer, as described in Van Devanter et al., Nucleic Acids Res. 12: 6159-6168 (1984). Purification of oligonucleotides is performed using any art-recognized strategy, e.g., native acrylamide gel electrophoresis or anion-exchange HPLC as described in Pearson & Reanier, J. Chrom. 255: 137-149 (1983).

The sequence of a gene of interest, such as the polynucleotide sequence encoding an enzyme like lipase or hydrolase, a polynucleotide encoding a Cry protein or a crystal-forming fragment thereof, and synthetic oligonucleotides can be verified after cloning or subcloning using, e.g., the chain termination method for sequencing double-stranded templates of Wallace et al., Gene 16: 21-26 (1981).

B. Coding Sequences

Polynucleotide sequences encoding Cry proteins, fragments, or fusion proteins for use in this invention can be readily constructed by using the corresponding coding sequences for the Cry proteins, fragments, or combining the coding sequences for the fusion partners, such as a Cry3Aa protein and Bacillus subtilis lipase A (lipA). The sequences for Cry proteins and enzymes are generally known and may be obtained from a commercial supplier.

In addition to the use of full length wild-type Cry proteins for producing crystal-forming proteins for use in this invention, fragments of Cry proteins and/or variants of Cry proteins may also be useful. A DNA sequence encoding a Cry protein can be modified to generate fragments or variants of the Cry protein. So long as the fragments and variants retain the ability to spontaneously form crystals when expressed in a host cell, especially a Bacillus bacterial cell, they can be used for producing the protein crystals, either by themselves or by way of fusion proteins capable of undergoing spontaneous crystallization, and therefore producing protein crystals containing one or more recombinant proteins (e.g., one or more enzymes) embedded within. Typically, the variants bear a high percentage of sequence identity (e.g., at least 80, 85, 90, 95, 97, 98, 99% or higher) to the wild-type Cry protein sequence, whereas the fragments may be substantially shorter than the full length Cry protein, such as having some amino acids (e.g., 10-300 or 20-200 or 50-100 amino acids) removed from the N- or C-terminus of the full length Cry protein. For example, a useful Cry3Aa fragment may be as short as the first 290 amino acids from the N-terminus, encompassing Domain I of the protein. Other examples of such fragments include a Cry protein fragment having its first 57 amino acids from the N-terminus removed and a Cry protein fragment having its C-terminal 18 amino acids removed. The ability of a recombinantly produced Cry protein, a fragment thereof, or a fusion protein comprising a Cry protein or fragment to undergo spontaneous crystallization can be verified by electron micrograph, whereas the enzymatic activity of a recombinantly produced enzyme, including in the form of a fusion protein with a Cry protein or a fragment thereof, can be confirmed by established assays for each specific enzyme. Exemplary Cry protein fragments capable of self-crystallizing can be found in the inventors’ earlier publications, e.g., WO2018/028371.

In the case of a fusion protein, a peptide linker or spacer is used between the coding sequences for a Cry protein/fragment and one or more heterologous polypeptides. One purpose is to ensure the proper reading frame for the fusion protein such that the coding sequences for both Cry protein/fragment and the heterologous polypeptide(s) are in frame. Another purpose is to provide appropriate spatial relationship between the Cry protein/fragment and the heterologous polypeptide(s), such that each component of the fusion protein may retain its original functionality: the Cry protein/fragment is able to cause self-crystallization of the fusion protein, and the heterologous polypeptide such as an enzyme remains active in its catalytic capacity. Also, one or more linkers may be placed at the very beginning and/or the very end of the open reading frame, so as to facilitate proper start and termination of the coding sequence translation. Such linkage amino acid sequences are usually short and typically no longer than 100 or 50 amino acids, such as between 1 and 100, 1 or 2 to 50, 2 or 3 to 25, 3 or 4 to 10 amino acids.
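The reading-frame requirement described above can be checked computationally before a construct is made. The following Python sketch is illustrative only: the function name `fusion_frame_problems` and the toy sequences in the note that follows are hypothetical, and the check assumes that the upstream coding sequence is supplied without its own stop codon and that the standard stop codons TAA, TAG, and TGA apply.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_frame_problems(upstream_cds: str, linker: str,
                          downstream_cds: str) -> list:
    """Return a list of reading-frame problems for a three-part fusion.

    An empty list means the linker keeps the downstream coding sequence in
    frame and no stop codon occurs before the final triplet.
    """
    problems = []
    if len(upstream_cds) % 3 != 0:
        problems.append(
            f"upstream length {len(upstream_cds)} is not a multiple of 3")
    if len(linker) % 3 != 0:
        problems.append(
            f"linker length {len(linker)} is not a multiple of 3")
    fused = (upstream_cds + linker + downstream_cds).upper()
    # Read complete triplets from the first base of the fused sequence.
    triplets = [fused[i:i + 3] for i in range(0, len(fused) - 2, 3)]
    # A stop codon anywhere before the last triplet truncates the fusion.
    if any(t in STOP_CODONS for t in triplets[:-1]):
        problems.append("premature stop codon in the fused reading frame")
    return problems
```

For example, a 6-nucleotide linker such as GGTTCT (encoding Gly-Ser) between two in-frame segments yields no problems, whereas a 5-nucleotide linker is reported as shifting the frame.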

C. Sequence Modification for Preferred Codon Usage in a Host Organism

The polynucleotide sequence encoding a recombinant protein to be expressed according to the method of this invention can be further altered to coincide with the preferred codon usage of a particular host. For example, the preferred codon usage of one strain of bacterial cells can be used to derive a polynucleotide that encodes a recombinant polypeptide of the invention and includes the codons favored by this strain. The frequency of preferred codon usage exhibited by a host cell can be calculated by averaging the frequency of preferred codon usage in a large number of genes expressed by the host cell (e.g., a calculation service is available from the website of the Kazusa DNA Research Institute, Japan). This analysis is preferably limited to genes that are highly expressed by the host cell.
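The codon usage calculation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `preferred_codons` is hypothetical, codon counts are pooled across all supplied genes rather than averaged gene by gene, the standard genetic code is used, and the input is assumed to be a set of in-frame coding sequences from highly expressed genes of the host.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def preferred_codons(genes):
    """Map each amino acid to (most used codon, its fraction among synonyms).

    `genes` is an iterable of in-frame coding sequences (DNA strings).
    """
    counts = Counter()
    for cds in genes:
        cds = cds.upper()
        for i in range(0, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # ignore triplets with ambiguous bases
                counts[codon] += 1
    # Group the pooled counts by the amino acid each codon encodes.
    by_amino_acid = defaultdict(dict)
    for codon, n in counts.items():
        by_amino_acid[CODON_TABLE[codon]][codon] = n
    result = {}
    for aa, codon_counts in by_amino_acid.items():
        total = sum(codon_counts.values())
        best = max(codon_counts, key=codon_counts.get)
        result[aa] = (best, codon_counts[best] / total)
    return result
```

A coding sequence could then be recoded by replacing each codon with the preferred synonymous codon reported for the chosen host.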

At the completion of modification, the coding sequences are verified by sequencing and are then subcloned into an appropriate expression vector for recombinant production of a protein (e.g., an enzyme such as a lipase) along with a Cry protein, a crystal-forming fragment of the Cry protein, or a fusion protein comprising the Cry protein or crystal-forming fragment, such that the protein (e.g., enzyme) is produced and embedded in the protein crystals formed by the Cry protein, fragment, or fusion protein.

III. Expression and Isolation of Proteins Embedded in Protein Crystals

Following verification of the coding sequence, co-expression of a protein of interest (such as an enzyme) and a Cry protein, a crystal-forming fragment thereof, or a fusion protein comprising the Cry protein or fragment of this invention can be achieved using routine techniques in the field of recombinant genetics, relying on the polynucleotide sequences encoding the Cry fusion protein disclosed herein.

A. Expression Systems

To obtain high level expression of a polynucleotide sequence encoding a recombinant protein, one typically subclones a polynucleotide encoding the protein in the correct reading frame into an expression vector that contains a strong promoter to direct transcription, a transcription/translation terminator and a ribosome binding site for translational initiation. Suitable bacterial promoters are well known in the art and described, e.g., in Sambrook and Russell, supra, and Ausubel et al., supra. Bacterial expression systems for expressing the polypeptide are available in, e.g., E. coli, Bacillus sp., Salmonella, and Caulobacter. Kits for such expression systems are commercially available.

The promoter used to direct expression of a recombinant protein depends on the particular host cells used for the recombinant protein production. For instance, for effective expression in a Bacillus bacterial strain such as Bacillus thuringiensis (Bt) or Bacillus subtilis (Bs) cells, a promoter known to direct robust protein expression in these particular bacterial cells should be chosen. As shown in the Examples of this disclosure, two separate promoters, cyt1Aa and cry3Aa, have been successfully used to direct the expression of a recombinant protein as well as a Cry protein. The promoter is optionally positioned about the same distance from an exogenous or recombinant protein transcription start site as it is from the transcription start site in its natural setting. As is known in the art, however, some variation in this distance can be accommodated without loss of promoter function. In some cases, a constitutive promoter is used, whereas in other cases an inducible promoter rather than a constitutive promoter is preferred. Further, the placement of a common promoter or multiple separate promoters (which may be the same or different kinds of promoters) and/or the placement of one or more separate ribosome binding sites separating different segments of coding sequences (each encoding a recombinant protein or a crystal-forming protein) in a common expression cassette (e.g., an expression vector such as a plasmid) can allow different expression ratios of the recombinant protein(s) to the crystal-forming protein so as to maximize the percentage of the recombinant protein(s) being entrapped or immobilized within the protein crystals formed by the crystal-forming protein such as a Cry protein, a crystal-forming fragment thereof, or a fusion protein comprising the Cry protein or the fragment.

In addition to the promoter, the expression vector typically includes a transcription unit or expression cassette containing all the additional elements that are required for the expression of the recombinant protein(s) and the crystal-forming protein in host cells. A typical expression cassette thus contains a promoter operably linked to the polynucleotide sequence encoding the recombinant protein and/or crystal-forming protein and signals required for efficient polyadenylation of the transcript, ribosome binding sites, and translation termination. The polynucleotide coding sequence may be linked to a cleavable signal peptide sequence to promote secretion of the polypeptide by the transformed cell. Such signal peptides include, among others, the signal peptides from tissue plasminogen activator, insulin, neuron growth factor, and juvenile hormone esterase of Heliothis virescens. Additional elements of the cassette may include enhancers and, if genomic DNA is used as the structural gene, introns with functional splice donor and acceptor sites.

In addition to a promoter sequence, the expression cassette should also contain a transcription termination region downstream of the coding sequence to provide for efficient termination. The termination region may be obtained from the same gene as the promoter sequence or may be obtained from different genes. The placement of a commonly shared termination site in combination with the placement of multiple promoters directing transcription of multiple coding sequences can also be used to adjust the expression ratios of a recombinant protein to a crystal-forming protein, see, e.g., illustration in FIG. 4.

The particular expression vector used to transport the genetic information into the cell is not particularly critical. Any of the conventional vectors used for expression in eukaryotic or prokaryotic cells may be used, especially those suitable for expression in cells of Bacillus sp. such as Bt and Bs. Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and fusion expression systems such as GST and LacZ.

The elements that are typically included in expression vectors also include a replicon that functions in bacteria such as Bacillus sp. and E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of coding sequences. The particular antibiotic resistance gene chosen is not critical; any of the many resistance genes known in the art are suitable. Similar to antibiotic resistance selection markers, metabolic selection markers based on known metabolic pathways may also be used as a means for selecting transformed host cells.

B. Transfection Methods

Standard transfection methods are used to produce bacterial, mammalian, yeast, insect, or plant cell lines that express large quantities of a recombinant fusion protein of this invention, which is then purified using standard techniques (see, e.g., Colley et al., J. Biol. Chem. 264: 17619-17622 (1989); Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, J. Bact. 132: 349-351 (1977); Clark-Curtiss & Curtiss, Methods in Enzymology 101: 347-362 (Wu et al., eds, 1983)).

Any of the well-known procedures for introducing foreign polynucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, liposomes, microinjection, plasma vectors, viral vectors and any of the other well-known methods for introducing cloned genomic DNA, cDNA, synthetic DNA, or other foreign genetic material into a host cell (see, e.g., Sambrook and Russell, supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the recombinant protein(s) along with a crystal-forming protein (e.g., a Cry protein, a crystal-forming fragment thereof, or a fusion protein comprising a Cry protein or a crystal-forming fragment thereof) according to the method of this invention. As described above and herein, the recombinant protein or proteins (e.g., enzyme or enzymes) and the crystal-forming protein may be contained within one single expression cassette, e.g., in the same expression vector, where all coding sequences may be under the control of one single promoter, or each coding sequence may be under the control of a separate promoter (which optionally may differ from one another). In the alternative, each one of the coding sequences may be present in a separate expression cassette (e.g., expression vector). In either alternative, different ratios of the recombinant protein(s) to the crystal-forming protein may be achieved by using single copy or multiple copies of any one coding sequence or by placing separate or commonly shared ribosome binding site(s) and/or termination site(s) in the expression cassette, see, e.g., illustration in FIG. 4.

C. Isolation of Crystal-Entrapped Recombinant Proteins

Once the expression of the recombinant protein(s) along with a crystal-forming protein (e.g., a Cry protein, a crystal-forming fragment thereof, or a fusion protein comprising a Cry protein or a crystal-forming fragment thereof) in transfected host cells is confirmed, e.g., via electron micrograph for detecting protein crystals or an immunoassay such as Western blotting analysis, the host cells are then cultured in an appropriate scale for the purpose of purifying or isolating the recombinant protein entrapped within the protein crystals formed by the Cry protein, a crystal-forming fragment thereof, or a fusion protein comprising a Cry protein or a crystal-forming fragment thereof.

When the recombinant protein(s) and the crystal-forming protein are produced recombinantly by transformed bacteria in large amounts, for example after promoter induction, the recombinant protein(s) become entrapped within the crystals formed by the crystal-forming protein. In other words, the recombinantly produced proteins are present in crystalline form or insoluble aggregates within the host cells. Thus, one can readily isolate the crystals from the cell lysate based on their distinct density by utilizing techniques such as centrifugation and density gradient separation followed by one or more rinsing steps to further remove contaminants from the protein crystals.

There are several protocols that are suitable for purification of protein inclusion bodies. For example, purification of aggregate proteins (hereinafter referred to as inclusion bodies) typically involves the extraction, separation and/or purification of inclusion bodies by disruption of bacterial cells, e.g., by incubation in a buffer of about 100-150 µg/ml lysozyme and 0.1% Nonidet P40, a non-ionic detergent. The cell suspension can be ground using a Polytron grinder (Brinkman Instruments, Westbury, NY). Alternatively, the cells can be sonicated on ice. Additional methods of lysing bacteria are described in Ausubel et al. and Sambrook and Russell, both supra, and will be apparent to those of skill in the art.

The cell suspension is generally centrifuged and the pellet containing the inclusion bodies is resuspended in a buffer that washes but does not dissolve the inclusion bodies, e.g., 20 mM Tris-HCl (pH 7.2), 1 mM EDTA, 150 mM NaCl and 2% Triton X-100, a non-ionic detergent. It may be necessary to repeat the wash step to remove as much cellular debris as possible. The remaining pellet of inclusion bodies may be resuspended in an appropriate buffer (e.g., 20 mM sodium phosphate, pH 6.8, 150 mM NaCl). Other appropriate buffers will be apparent to those of skill in the art.
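As a worked illustration of preparing the exemplary wash buffer above, the following Python sketch applies the standard dilution relation C1 x V1 = C2 x V2 to compute the volume of each stock solution needed. The target concentrations are those given in the text; the stock concentrations (1 M Tris-HCl, 0.5 M EDTA, 5 M NaCl, and 20% Triton X-100) and the function name `stock_volumes` are assumptions chosen for the example and are not specified in this disclosure.

```python
def stock_volumes(final_volume_ml, targets, stocks):
    """Volume (mL) of each stock needed, plus water, for the final volume.

    `targets` and `stocks` map component name -> concentration; each
    component must use the same unit in both dictionaries.
    """
    volumes = {}
    for name, target in targets.items():
        stock = stocks[name]
        if target > stock:
            raise ValueError(f"{name}: target exceeds stock concentration")
        # C1 * V1 = C2 * V2, solved for the stock volume V1.
        volumes[name] = final_volume_ml * target / stock
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stock volumes exceed the final volume")
    volumes["water"] = water
    return volumes


# Target concentrations from the wash buffer described in the text.
WASH_TARGETS = {"Tris-HCl pH 7.2 (mM)": 20, "EDTA (mM)": 1,
                "NaCl (mM)": 150, "Triton X-100 (%)": 2}
# Hypothetical stock concentrations, in the same units as the targets.
WASH_STOCKS = {"Tris-HCl pH 7.2 (mM)": 1000, "EDTA (mM)": 500,
               "NaCl (mM)": 5000, "Triton X-100 (%)": 20}
```

With these assumed stocks, calling `stock_volumes(100, WASH_TARGETS, WASH_STOCKS)` gives the volumes to combine for 100 mL of wash buffer.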

Upon isolation, the recombinant protein(s) recovered from host cells in the form of protein crystals may be used directly according to their inherent biological activity: for example, a lipase entrapped in Cry protein crystals may be used in a reaction to hydrolyze triglycerides. By virtue of being in an insoluble crystal form, the lipase has a heightened level of resistance to harsh environmental conditions such as high temperature, extreme pHs, organic solvents, etc., thus allowing repeated cycles of cleaning and reuse.

In the alternative, following the washing step, the inclusion bodies are solubilized to release the entrapped recombinant protein(s) by the addition of a solvent that is both a strong hydrogen acceptor and a strong hydrogen donor (or a combination of solvents each having one of these properties). The protein(s) from the inclusion bodies may then be renatured by dilution or dialysis with a compatible buffer. Suitable solvents include, but are not limited to, urea (from about 4 M to about 8 M), formamide (at least about 80%, volume/volume basis), and guanidine hydrochloride (from about 4 M to about 8 M). Some solvents that are capable of solubilizing aggregate-forming proteins, such as SDS (sodium dodecyl sulfate) and 70% formic acid, may be inappropriate for use in this procedure due to the possibility of irreversible denaturation of the proteins, accompanied by a lack of immunogenicity and/or activity. Although guanidine hydrochloride and similar agents are denaturants, this denaturation is not irreversible and renaturation may occur upon removal (by dialysis, for example) or dilution of the denaturant, allowing re-formation of the immunologically and/or biologically active protein of interest. After solubilization, the protein(s) can be separated from other bacterial proteins by standard separation techniques. For further description of purifying recombinant polypeptides from bacterial inclusion bodies, see, e.g., Patra et al., Protein Expression and Purification 18: 182-190 (2000).

While the protein crystals tend to remain insoluble at lower or neutral pHs, placing them in alkaline solutions with pH at or greater than 10 or 11 can often effectively dissolve the protein. Once dissolved, the protein can then be analyzed by gel separation (e.g., on an SDS gel) and immunoassays to confirm its identity based on the appropriate molecular weight and immunoreactivity.

IV. Applications of Immobilized Proteins

Another aspect of the present invention relates to the use of a recombinant protein, especially an enzyme, entrapped and immobilized in protein crystals produced according to the methods described herein to exert the protein’s inherent biological activity, for example, to perform reactions typically catalyzed by the enzyme present in the protein crystals, such as hydrolysis, esterification, ligation, proteolysis, and the like. As organic solvents are often able to facilitate such reactions and the immobilized recombinant protein produced by the method of this invention is highly tolerant to the presence of organic solvents, a reaction performed using the immobilized protein according to this invention often includes not only a water-based solvent but also one or more organic solvents, e.g., ethanol, methanol, acetonitrile, and dimethylformamide.

As the inventors discovered, immobilization of recombinant protein(s) within the crystalline Cry protein or fusion proteins gives the protein(s) a higher level of resistance to organic solvents and a higher level of thermostability, so that the protein(s) can potentially retain enzymatic activity for use in more cycles of reactions. In some cases, this reaction process includes a cleaning step, performed after the completion of one round of the reaction and removal of the reaction product(s) as well as any remaining substrate, during which the protein crystals containing recombinant protein(s) are rinsed or washed in preparation for being used again with fresh substrate in a subsequent round of reaction.

EXAMPLES

The following examples are provided by way of illustration only and not by way of limitation. Those of skill in the art will readily recognize a variety of non-critical parameters that could be changed or modified to yield essentially the same or similar results.

Example 1

Introduction

Immobilization of Enzymes

The major bottleneck to the practical use of enzymes as industrial catalysts is the cost of production and usable lifetime. While biomolecular engineering methods can be used to enhance enzyme stability, these techniques do not aid in rapid production and isolation, particularly when high purities are necessary. A cost-effective biocatalyst should be able to be recycled over many reaction cycles, with minimal loss of activity. One method to improve enzyme stability and make enzymes reusable is immobilization, either by covalent attachment to beads, or by adsorption into porous materials. Immobilization of enzymes allows them to be easily filtered and reused over successive reaction cycles, and in some cases can lead to increased temperature stability and/or organic solvent tolerance due to the reduced conformational flexibility of the enzyme in the matrix.¹ This rigidity is also beneficial to mechanical stability by helping to protect against the constant agitation that occurs in large-scale reactors.²

A major limitation of both bead attachment and porous materialadsorption approaches, however, is that ~90-99% of the biocatalystcomposite is inactive, greatly reducing the catalytic productivity perweight.³ Furthermore, these approaches require multiple steps: (1)production and purification of the enzyme catalyst, (2) production ofthe support, and (3) anchoring of the enzyme catalyst on the support.These steps can add significantly to the cost and impact catalyticactivity. If a strategy can be developed to generate and isolate animmobilized lipase in a single step, it can dramatically lower the costof the catalyst.

Genetically-Encoded Immobilized Lipases

Genetically-encoded, or in vivo, immobilization approaches, which involve the direct production of immobilized enzymes in bacterial cells, represent a promising direction for more efficient and economical biocatalyst production. Consolidating expression and immobilization into a single step removes the need for columns for enzyme purification, as well as reagents to introduce reactive chemistries, greatly reducing time and production costs.

One of the earliest reports of producing active enzyme particles in cells came from the work of Worrall et al., who showed that they could produce catalytically active inclusion bodies (CatIBs) of β-galactosidase in E. coli.⁴ This discovery upended the paradigm that inclusion bodies (IBs) were inactive waste species in cells and prompted the idea that IBs could be used as immobilized catalysts. Since most enzymes are naturally soluble, producing CatIBs can be a challenge. Thus most approaches to produce in vivo immobilized enzymes have involved the incorporation of a fusion tag to promote aggregation.

Several fusion tags have been exploited to generate immobilized enzymes in vivo, including self-aggregating peptides and protein domains. Peptide tags such as ELK16 and L₆KD have been successfully fused to target enzymes to drive in vivo aggregation. While the use of tags resulted in the formation of highly active enzyme aggregates, none of these aggregates was shown to be reusable.^(5,6) Similarly, several protein domains have been used to promote in vivo enzyme immobilization in cells.⁷⁻⁹ The most successful example has been the work of Diener et al., who used the short coiled-coil domain TDoT of the cell-surface protein tetrabrachion from the hyperthermophilic archaeon Staphylothermus marinus to immobilize several enzymes.^(10,11) While enzymes immobilized by the TDoT tag displayed good recyclability, the aggregates produced generally had low activity due to diffusion limitations. Identifying a platform capable of promoting the production of in vivo immobilized enzymes with both high activity and good reusability had, until recently, remained elusive.

Cry-Enzyme Fusion Proteins as Genetically-Encoded Immobilized Catalysts

Recently a new strategy was developed to produce genetically-encoded immobilized enzymes based on crystal proteins (Cry), such as Cry3Aa. Cry proteins are insecticidal proteins that innately form crystals inside the bacterium Bacillus thuringiensis (Bt) (FIG. 3).¹²⁻¹⁴ It was previously demonstrated that expression of Cry3Aa fused to GFP or mCherry yielded fluorescent protein crystals in Bt, indicating that the fusion partners were folded and functional.¹⁵ Electron microscopy carried out on Cry3Aa crystals and Cry3Aa-GFP crystals revealed them to be similar in size, indicating that the fusion partner does not change the crystal packing (FIGS. 3b,c).¹⁵

The use of this Cry crystal platform to produce genetically-immobilized lipases as potential biodiesel catalysts was explored, see, e.g., WO2018/028371. The first-generation biodiesel catalyst was generated by genetically fusing Bacillus subtilis lipase A (lipA) to the C-terminus of a truncated Cry3Aa (Cry3Aa*) and then producing the fusion protein in Bt. The resultant Cry3Aa*-lipA fusion crystals exhibited higher catalytic activities compared to free lipA, as well as improved solvent and temperature stability.¹⁶ Significantly, the Cry3Aa*-lipA fusion crystals can be used to produce biodiesel from coconut oil with high conversion efficiency over 10 reaction cycles (FIG. 4). These properties made the Cry3Aa*-lipA catalyst the first example of a genetically-encoded immobilized enzyme that retained both high activity and recyclability.

Cry3Aa as a Carrier for Enzyme Entrapment

A potential limitation of using Cry3Aa-fusion crystals, however, is that the enzyme of interest is chemically modified at its N-terminus, which can hamper enzyme activity. Moreover, fusion to a bulky Cry3Aa protein may interfere with folding of the enzyme. Indeed, Cry3Aa-fusion enzymes of a Proteus mirabilis lipase have been generated, and these retained only 5% of the activity of the wild-type enzyme. This motivated further efforts to explore novel approaches to produce immobilized enzymes using Cry3Aa.

Cry3Aa protein crystals contain large porous 50 Å × 50 Å channels. It is envisaged that these channels can act as cages to trap the enzyme of interest inside (FIG. 3). Since Cry3Aa forms a crystal, the target enzyme should be spatially confined and ordered such that each monomer does not interfere with the other. Moreover, since the unit cell dimensions of Cry3Aa are known, one can quantify the amount of enzyme molecules loaded into the Cry3Aa crystals, a characteristic that can be measured using the Cry3Aa-fusion crystals based on the structural information. Most importantly, this strategy retains the feature of a one-step immobilization platform, and the entrapped enzyme is much more likely to be in its native state since there are no covalent modifications.

Results and Discussions Entrapment of PML Inside Cry3Aa Crystals

Proteus mirabilis lipase (PML) is a bacterial lipase capable of converting vegetable oils, including waste cooking oil, to biodiesel. PML is highly soluble and expresses well in E. coli. The potential of PML fused to Cry3Aa was recently explored as a catalyst for biodiesel production from canola oil and waste cooking oil, and an optimized mutant containing two point mutations (I118V, E130G; PML^(VG)) was produced. PML is a good target for Cry3Aa entrapment since the fusion of PML to Cry3Aa drastically reduces the activity. Additionally, the size of PML (55 Å × 41 Å × 26 Å) is a good fit for the Cry3Aa channel (FIG. 1). For clarity, square brackets ([ and ]) are used to indicate the entrapment of the PML molecules (e.g., Cry3Aa[PML]), whereas a hyphen (-) indicates a genetic fusion (e.g., Cry3Aa-PML).
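The size comparison above can be illustrated with a rough geometric check. The following is only a sketch under stated assumptions: it treats the protein as a rigid box that enters the channel along its longest axis, ignores rotation, flexibility, and hydration, and uses the hypothetical helper name fits_channel. Only the dimensions themselves (55 Å × 41 Å × 26 Å for PML and 50 Å × 50 Å for the channel) come from the text.

```python
def fits_channel(protein_dims, channel_cross_section):
    """Crude box-in-channel test (an assumption, not the inventors' method).

    The protein is assumed to slide in along its longest axis, so only its
    two smallest dimensions are compared with the channel opening.
    """
    smallest, middle, _longest = sorted(protein_dims)
    short_side, long_side = sorted(channel_cross_section)
    return smallest <= short_side and middle <= long_side


# Dimensions in angstroms, as quoted in the text.
PML_DIMS = (55.0, 41.0, 26.0)
CRY3AA_CHANNEL = (50.0, 50.0)

if __name__ == "__main__":
    print(fits_channel(PML_DIMS, CRY3AA_CHANNEL))
```

With these numbers the 41 Å × 26 Å cross-section of PML falls inside the 50 Å × 50 Å opening, which is consistent with the "good fit" described above; a structural model would be needed for anything more than this first-pass estimate.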

Several plasmid variations were generated to improve the amount of PML loaded into the crystals. A list of these constructs is displayed in FIG. 4. The initial Cry3Aa[PML] coexpression cassette was generated by cloning the PML gene containing a separate ribosome binding site downstream of Cry3Aa. This plasmid was transformed and expressed, and the resulting crystals were purified as described previously.¹⁵ Solubilization of the crystals and analysis by SDS-PAGE indicated that the Cry3Aa crystals contained PML (FIG. 5a).

Several methods were tried to improve the PML loading into the Cry3Aa crystals. First, Cry3Aa was put under the control of two cyt1Aa promoters¹⁷ and PML under one cry3Aa promoter (3PPML^(VG)[Cyt13A]). It was anticipated that the temporal difference between the promoters (cyt1Aa is activated during late-stage sporulation while cry3Aa is activated during vegetative growth and early-stage sporulation) would cause PML to accumulate in the cell first and increase the relative ratio of PML to Cry3Aa during crystallization. Although PML entrapment was achieved, the loading was much lower than with Cry3Aa[PML^(VG)] (FIG. 5a).

Another approach was based on increasing the ratio of PML to Cry3Aa in the transcript. This was accomplished by implementing two strategies. The first was to insert another copy of PML downstream of Cry3Aa[PML^(VG)], making Cry3Aa[PML^(VG)]². Since the proteins are under one promoter, each transcript will have double the amount of PML per Cry3Aa. The second was to put Cry3Aa and PML under two separate cry3Aa promoters but only one terminator (3P3A[3PPML^(VG)]). In the latter case, for each transcript containing Cry3Aa there will be two transcripts containing PML. Each of these strategies should result in a theoretical 2:1 PML:Cry3Aa ratio. Both constructs resulted in higher loading of PML into the Cry3Aa crystals, with 3P3A[3PPML^(VG)] demonstrating the highest loading efficiency at 64-75% depending on the batch (FIG. 5b). This corresponds to around 4 × 10⁵ to 6 × 10⁵ molecules of PML^(VG) per crystal, which is unprecedented in terms of enzyme loading. Notably, these crystals retained near identical morphology and size to native Cry3Aa crystals (FIG. 6).
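The theoretical 2:1 PML:Cry3Aa ratio for the two construct designs can be illustrated by counting coding sequences per transcript. The sketch below is a simplified model rather than a description of the actual expression behavior: it assumes every promoter fires equally often, every open reading frame on a transcript is translated with equal efficiency, and transcription from an upstream promoter reads through to the shared terminator. The function name cargo_to_carrier_ratio and the list-of-lists representation are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def cargo_to_carrier_ratio(transcripts, carrier="Cry3Aa", cargo="PML"):
    """Return the cargo:carrier ORF ratio summed over all transcripts.

    Each transcript is a list of ORF names; one transcript per promoter is
    assumed, with equal promoter strength and equal translation efficiency.
    """
    counts = Counter(orf for transcript in transcripts for orf in transcript)
    return Fraction(counts[cargo], counts[carrier])


# One promoter, one copy of PML after Cry3Aa (the initial cassette).
single_copy = [["Cry3Aa", "PML"]]

# One promoter, two copies of PML after Cry3Aa (Cry3Aa[PML]^2 design).
double_copy = [["Cry3Aa", "PML", "PML"]]

# Two promoters, one terminator: the upstream promoter reads through both
# genes, while the downstream promoter yields a PML-only transcript.
shared_terminator = [["Cry3Aa", "PML"], ["PML"]]

if __name__ == "__main__":
    for name, design in [("single copy", single_copy),
                         ("double copy", double_copy),
                         ("shared terminator", shared_terminator)]:
        print(name, cargo_to_carrier_ratio(design))
```

Under these assumptions both optimized designs give the same nominal ratio, so the difference in measured loading between them reported above would have to come from factors the model leaves out, such as promoter timing or translation efficiency.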

The activities of all the resulting PML-entrapped Cry3Aa crystals were determined by measuring the hydrolysis of p-nitrophenyl palmitate. As demonstrated in FIG. 7, higher PML loading into the crystals resulted in higher activity per µg crystal, with the 3P3A3P[PML^(VG)] construct displaying the highest activity. Moreover, entrapment of PML^(VG) inside Cry3Aa crystals significantly stabilized it against thermal denaturation, displaying a 200-fold increase in residual activity after incubation for 1 h at 58° C. (FIG. 8B).

With the optimized immobilized lipase at hand, the practicality of the catalyst for industrial reactions was tested. Two common applications of lipases are biodiesel production (Scheme 1 in FIG. 16A) and ester synthesis in organic solvents, such as synthesis of the cosmetic compound benzyl laurate (Scheme 2 in FIG. 16A).

The optimized Cry3Aa[PML^(VG)] crystals were capable of converting waste cooking oil into biodiesel in 10 h and retained maximal conversion for 7 reaction cycles (FIG. 8A). These data support the high catalytic activity and stability of the immobilized catalyst for biodiesel production.

For benzyl laurate synthesis, the optimized Cry3Aa[PML^(VG)] crystals were lyophilized in water for 24 h. The dried crystals were then reacted with 10 mM benzyl alcohol and 10 mM lauric acid in neat isooctane for 24 h. Gas chromatography-mass spectrometry (GCMS) analysis of the reaction mixture revealed that benzyl laurate was produced (FIG. 9). While this reaction has not been optimized to maximize yield, it demonstrates that the Cry3Aa entrapped crystals can be used in neat organic solvents for synthesis reactions.

As further demonstration that PML can be entrapped within Cry3Aa crystals, in vitro co-crystallization of carboxy-rhodamine-labelled PML and Cry3Aa was performed by vapor diffusion (FIG. 10). Since only Cry3Aa exhibits fluorescence upon UV (280 nm) irradiation, and PML is dye-labelled, formation of Cry3Aa-entrapped PML crystals could be confirmed based on their fluorescent properties (FIG. 10). Notably, these data demonstrate that Cry3Aa-mediated entrapment can also be performed in vitro.

Coentrapment of Multiple Proteins Inside Cry3Aa Crystals

One major advantage of using enzymatic over chemical catalysis is that enzymes are specific, allowing for multiple reactions to occur in one pot without unwanted side reactions or side products. It was hypothesized that, using the expression platform, one could entrap multiple enzymes simultaneously in a single crystal. Notably, this would offer a significant advantage over other multi-enzyme systems since there is no need to purify the enzymes separately. As a proof of concept, mCherry, green fluorescent protein (GFP) and Cry3Aa were incorporated on the same plasmid and expressed in Bt. Cry3Aa was expressed under one cry3Aa promoter, while GFP and mCherry were expressed under a separate cry3Aa promoter, separated by a ribosome binding site. The crystals produced displayed both red and green fluorescence, indicating that co-entrapment of both proteins occurred in a single crystal (FIG. 11).

Co-immobilizing two or more enzymes on a single platform can be beneficial to either impart the catalyst with a broader substrate specificity or generate a coupled system that promotes the synthesis of higher value products. In the latter case, active site proximity can lead to much faster turnover rates since intermediates are released in close proximity to the subsequent active site, so that they can be quickly turned over before diffusing away.

Glycerol, a byproduct of FAME biodiesel production, is usually extracted from the aqueous layer and purified prior to its separate commercialization. A more efficient and value-added approach would be to simultaneously convert glycerol to a cosmetic product during the transesterification process. Not only would this save time by doing both reactions in one pot, but also removal of glycerol might increase the efficiency and yield of FAME biodiesel by pulling the equilibrium of the transesterification reaction towards the products via Le Chatelier’s principle.

Among glycerol-derived cosmetics, glycosylated glycerol such as 2-O-(α-D-gluco-pyranosyl)-sn-glycerol (αGG) has gained a lot of attention as a powerful moisturizing agent.³⁰ Interestingly, Goedl et al. demonstrated that they could produce this compound enzymatically from sucrose and glycerol using a sucrose phosphorylase enzyme from Leuconostoc mesenteroides (LmSP).³⁰ Therefore, it would be plausible to simultaneously convert oil to biodiesel and αGG using a one-pot combination of PML and LmSP (Scheme 3 in FIG. 16B).

Having already entrapped PML inside Cry3Aa crystals, the inventors’ next step was to first verify that LmSP can be entrapped inside Cry3Aa crystals as well. A plasmid was constructed that coexpresses Cry3Aa and LmSP on separate cry3Aa promoters. This plasmid was transformed and expressed in Bt. SDS-PAGE analysis of the crystals indicated that LmSP was entrapped inside Cry3Aa crystals with high efficiency: 33% of the protein molecules in the crystals are LmSP (FIG. 12a). For the LmSP:PML dual catalyst construct, a copy of PML^(VG) was inserted downstream of LmSP with its own ribosome binding site. After expression of this new construct 3P3A3P[LmSP][PML^(VG)] in Bt and purification, SDS-PAGE analysis revealed that both LmSP and PML^(VG) were entrapped in Cry3Aa crystals with high efficiency (FIG. 12a). These 3P3A3P[LmSP][PML^(VG)] crystals displayed both high sucrose phosphorylase activity (FIG. 12b) and pNPP hydrolysis activity (FIG. 12c).

Entrapment of PML Inside Cry3Aa* Crystals

A potential limitation of using native Cry3Aa crystals is that they are soluble at alkaline pH. Thus, reactions performed at alkaline pH will cause Cry3Aa to solubilize and release the entrapped protein or enzyme. It was previously demonstrated that truncation of 18 amino acids from the C-terminus of Cry3Aa and expression in Bt resulted in the production of particles that were much more resistant to solubilization. It was hypothesized that if these C-terminally truncated crystals (Cry3Aa*) still retained pores, they could also entrap proteins or enzymes in vivo. To test this, Cry3Aa* and two copies of PML^(VG) were put under a single cry3Aa promoter with each protein having its own ribosome binding site. As expected, SDS-PAGE analysis demonstrated that PML^(VG) was entrapped in Cry3Aa* crystals with high efficiency and the resulting Cry3Aa*[PML^(VG)]² crystals displayed high activity (FIG. 13).

Entrapment of PML Inside Cry3Aa-Fusion Crystals

One can envisage that if Cry3Aa-fusion crystals formed pore-like structures, they should also be capable of enzyme entrapment. Notably, this would be a facile strategy to produce multi-protein or multi-enzyme immobilized systems. One advantage of this system over co-entrapment in native Cry3Aa crystals is that the fusion partner will be expressed in a 1:1 ratio with Cry3Aa, so proteins or enzymes that exhibit poor expression levels can be fused to Cry3Aa or Cry3Aa*. The feasibility of this approach has already been demonstrated by coexpressing Cry3Aa* fused to SpyCatcher (Cry3Aa*-SpyCat) and PML^(VG) to produce Cry3Aa*-SpyCat[PML^(VG)] crystals (FIG. 13a). While SpyCatcher (10 kDa) is not an enzyme, it can act as a reaction handle to conjugate candidate enzymes fused with a Spytag to produce immobilized dual enzyme crystals.

It was recently discovered that CotA-laccase can convert linoleic acid, an abundant fatty acid in waste oils, into azelaic acid, a valuable chemical used in the cosmetic, pharmaceutical and polymer industries.¹⁶ Thus, instead of using methanol to produce biodiesel, lipase-mediated hydrolysis of the triacylglycerols in water will generate free linoleic acid, which can be subsequently converted to azelaic acid by CotA-laccase (Scheme 4 in FIG. 16B). Since CotA expression is low, it is a good candidate for conjugation to the Cry3Aa*-SpyCat[PML^(VG)] crystals to produce the lipase:laccase catalyst. Preliminary results described herein indicate successful conjugation of CotA-Spytag to Cry3Aa*-SpyCat[PML^(VG)] crystals, with full retention of laccase activity (FIG. 13).

Another example of entrapment of proteins inside Cry3Aa-fusion crystals is the entrapment of PML^(VG) inside Cry3Aa*-lipA crystals. Waste cooking oils are economical feedstocks for biodiesel production, but their compositions vary from source to source. Therefore, a single catalyst containing lipases with different chain specificities would be a versatile catalyst for waste cooking oil conversion to biodiesel. The plan was to coexpress Cry3Aa, the PML mentioned previously, and lipase A from Bacillus subtilis (lipA) in the same Bt cell. PML prefers long chain fatty acids (C₁₄-C₂₀) while lipA prefers medium chain fatty acids (C₆-C₁₂). Experimental data showed that PML^(VG) could be entrapped inside Cry3Aa*-lipA crystals and the resulting dual lipase construct was an effective catalyst for waste cooking oil conversion to biodiesel (FIG. 14).

Example 2 Crystals Formed from Cry3Aa Mutants with Multiple Negatively Charged Amino Acid Substitutions (NegCry3Aa) Enhance the Binding of Certain Proteins

To investigate whether the Cry3Aa protein crystal can be engineered to enhance its affinity to the encapsulated protein cargo, a negatively-charged Cry3Aa mutant (negCry3Aa) was generated by substituting specific residues within Domain II that are exposed to the crystal solvent channel with negatively-charged amino acids, namely, aspartate (D) and glutamate (E). For initial studies, the following mutations: K384E, N391D, N395D, S425E, Q430E, TQ436437EE, KR442443EE, T461D, and K467E (bold in the protein sequence alignment of negCry3Aa (SEQ ID NO:6) to Cry3Aa (SEQ ID NO:4) shown in FIG. 17) were engineered into the wild-type cry3aa gene, which was then used to make the final expression construct comprised of the negCry3Aa, a ribosome binding site and the supercharged GFP (scGFP), see Lawrence MS, Phillips KJ, Liu DR. Supercharging proteins can impart unusual resilience. J Am Chem Soc. 2007;129(33):10110-10112. This expression plasmid was then transformed into Bt cells and grown for 48 h or 72 h. Cells were harvested and the resultant crystals (NegCry3Aa-RBS-scGFP) were purified by sucrose density gradient centrifugation, followed by extensive washing in ice-cold ddH₂O. As shown in FIG. 18A, the NegCry3Aa-RBS-scGFP crystal pellets exhibited bright green fluorescence when imaged at the GFP excitation wavelength while no such fluorescence was observed for the control pellet. To further verify the encapsulation of scGFP in negCry3Aa, the negCry3Aa-RBS-scGFP crystal pellets were solubilized in standard SDS sample buffer and boiled for 5 min, followed by SDS-PAGE. As indicated in the SDS-PAGE gel in FIG. 18B, two major bands corresponding to negCry3Aa (theoretical MW = 73.1 kDa) and scGFP (theoretical MW = 28.5 kDa), respectively, can be observed.
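The effect of the listed substitutions on the formal charge of each Cry3Aa monomer can be tallied with a short script. This is a minimal sketch, assuming simple integer side-chain charges at neutral pH (D and E as -1, K and R as +1, all other residues including histidine as 0) and assuming that a compound label such as TQ436437EE denotes two adjacent single substitutions (T436E and Q437E). The names parse_mutations and net_charge_change are hypothetical helpers, not part of any described method.

```python
import re

# Formal side-chain charges at neutral pH (a simplifying assumption).
FORMAL_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}

NEGCRY3AA_MUTATIONS = [
    "K384E", "N391D", "N395D", "S425E", "Q430E",
    "TQ436437EE", "KR442443EE", "T461D", "K467E",
]


def parse_mutations(label):
    """Split a label such as 'K384E' or 'TQ436437EE' into (old, pos, new) tuples."""
    match = re.fullmatch(r"([A-Z]+)(\d+)([A-Z]+)", label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label}")
    old, digits, new = match.groups()
    n = len(old)
    if len(new) != n or len(digits) % n != 0:
        raise ValueError(f"inconsistent mutation label: {label}")
    width = len(digits) // n  # digits per position, e.g. 3 for '436437'
    positions = [int(digits[i * width:(i + 1) * width]) for i in range(n)]
    return list(zip(old, positions, new))


def net_charge_change(labels):
    """Sum the formal charge change over all single substitutions."""
    total = 0
    for label in labels:
        for old, _pos, new in parse_mutations(label):
            total += FORMAL_CHARGE.get(new, 0) - FORMAL_CHARGE.get(old, 0)
    return total


if __name__ == "__main__":
    singles = [m for label in NEGCRY3AA_MUTATIONS for m in parse_mutations(label)]
    print(len(singles), "substitutions, formal charge change:",
          net_charge_change(NEGCRY3AA_MUTATIONS))
```

Under these assumptions the nine labels expand to eleven single substitutions with a formal charge change of -15 per monomer, which fits the rationale of pairing negCry3Aa with positively charged cargo such as scGFP. Actual charges depend on local pKa values and pH and are not captured here.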

Entrapment of lipAR9 in negCry3Aa Crystals

The negCry3Aa mutant can also be used to improve the loading of a lipase. When lipase A fused to a polyarginine tail (lipAR9) was co-expressed with negCry3Aa, the loading improved (FIG. 19).

Example 3 Cry3Aa Fusion Protein Capable of Protein-Entrapping

The present inventors have demonstrated in their previous studies that Cry3Aa carrying 1-3 repeats of a 56-residue metallothionein protein having the following amino acid sequence (SEQ ID NO:8): MTSTTLVKCACEPCLCNVDPSKAIDRNGLYYCSEACADGHTGGSKGCGHTGCNCHG genetically fused to its C-terminus can still form fusion crystals (Cry3Aa-[SmtA]₁₋₃) in Bt (Sun, Q., Cheng, SW., Cheung, K., Lee, MM., Chan, MK., “Cry protein crystal-immobilized metallothioneins for bioremediation of heavy metals from water”, Crystals, 2019, 9(6): 287).

The Cry3Aa-[SmtA]₁ construct (a fusion of SEQ ID NO:4 and SEQ ID NO:8) was then tested to reveal whether it can entrap mCherry in vivo when the coding sequences for Cry3Aa-[SmtA]₁ and mCherry are co-expressed in Bt. A pHT315 plasmid harboring the coding sequence of Cry3Aa-[SmtA]₁, a ribosome binding site, and the coding sequence of mCherry was transformed into Bt and grown for 48-72 h. As indicated on the SDS-PAGE gel shown in FIG. 20, the extracts from the 40/60% sucrose layer (lane 4) contained Cry3Aa-[SmtA]₁ (MW ~78 kDa) and mCherry (MW ~26 kDa), thus indicating the entrapment of mCherry in Cry3Aa-[SmtA]₁.

Example 4 Changing the Size of Cry3Aa Channel

The size of the Cry3Aa solvent channel will influence the size of the protein (e.g., an enzyme) that can be entrapped inside the crystal. For instance, a larger protein may require more space in the channel to be accommodated without steric clash, while a smaller protein might necessitate a smaller channel to prevent its diffusion. In order to increase the size of the channel, amino acids exposed to the solvent channel (FIG. 21) would need to be removed from the Cry3Aa sequence (SEQ ID NO:4). These regions include segments 110-128, 181-197, 211-226, 250-264, 423-436, 457-471, 593-598, and 627-632 of Cry3Aa (SEQ ID NO:4). The specific amino acids that can be deleted are indicated in bold in the Cry3Aa sequence (FIG. 22) and are listed in FIG. 23.

In order to illustrate this for a potential target enzyme, COOT modeling was used to fit a sucrose phosphorylase (LmSP) homolog, Thermoanaerobacter thermosaccharolyticum 6F-phosphate phosphorylase (PDB ID: 6S9V), in the Cry3Aa crystal channel. While the majority of LmSP can be accommodated in the crystal, several loops (colored in red) clash with the enzyme model (FIG. 24a). Notably, several of these regions are the same as listed in FIG. 22. One specific area that interferes with LmSP accommodation is the region from 423-436. Therefore, amino acids Gln 426, Tyr 427, Glu 433 and Ala 434 in this region were deleted from Cry3Aa to expand the channel size. When LmSP was co-expressed with this new mutant of Cry3Aa (3A2-2, SEQ ID NO:7), significantly improved entrapment of LmSP in the crystal was observed (FIG. 24b).

Thus, the present inventors have illustrated for the first time that modifying a Cry protein by amino acid insertions or deletions (for example, an insertion of at least 1, 2, 3, 4, 5, 6, or 7 amino acids, or a deletion of at least 1, 2, 3, 4, 5, 6, or 7 amino acids) at one or more of the 8 specific regions (for example, at least 1, 2, 3, 4, 5, 6, or 7 regions) shown in FIG. 22 can effectively reduce or enlarge the protein’s channel size and can therefore enable the resultant crystal-forming protein to entrap a target protein (e.g., an enzyme) with a higher level of efficacy, i.e., improved entrapment/retention rate and limited diffusion rate for the target protein.
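The deletion design described in this example can be sketched as a small sequence-editing helper that removes residues by 1-indexed position while checking that each position holds the expected amino acid. This is an illustrative sketch only: delete_residues is a hypothetical name, and because SEQ ID NO:4 is not reproduced in this section, the demonstration uses a short made-up toy sequence rather than the Cry3Aa sequence.

```python
def delete_residues(sequence, expected):
    """Delete residues at 1-indexed positions after verifying their identity.

    expected maps position -> one-letter code that should be found there,
    e.g. {426: "Q", 427: "Y", 433: "E", 434: "A"} for the 3A2-2 design
    when applied to the full Cry3Aa sequence (not included here).
    """
    for position, residue in expected.items():
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position {position} is outside the sequence")
        found = sequence[position - 1]
        if found != residue:
            raise ValueError(
                f"expected {residue} at position {position}, found {found}"
            )
    drop = set(expected)
    return "".join(
        residue
        for position, residue in enumerate(sequence, start=1)
        if position not in drop
    )


if __name__ == "__main__":
    # Toy sequence (hypothetical): M1 A2 Q3 Y4 L5 L6 E7 A8 K9.
    toy = "MAQYLLEAK"
    print(delete_residues(toy, {3: "Q", 4: "Y", 7: "E", 8: "A"}))
```

Checking residue identity before deleting guards against off-by-one numbering errors, which matter here because the deletions are defined relative to the numbering of SEQ ID NO:4.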

Example 5 Incorporating Mutations in Cry3Aa to Stabilize the Crystal

A very important characteristic of a catalyst is its reusability. Cry3Aa crystals are inherently reusable since they are solid materials. If the crystals solubilize, however, they will release the entrapped enzyme into the bulk media. Chemical cross-linkers can be used to reduce the solubility, but this adds an additional step to catalyst production and poses environmental concerns. Taking advantage of the genetically encoded nature of the Cry3Aa crystal, a more elegant approach is to introduce stabilizing mutations at the intermolecular interfaces to generate an intrinsically insoluble crystal.

One such position is Ser 145, whose Cα atom is only 3.7 Å away from the Cα of Ser 145 of an adjacent Cry3Aa monomer, making the pair suitable for disulfide bond formation (FIG. 25a). When Ser 145 was mutated to cysteine (3AS145C), the crystal became much more resistant to solubilization at alkaline pH (FIG. 25b). Solubilization of the crystals in DTT confirmed disulfide bond formation. Another potential position to stabilize is His 161, which interacts with Glu 168 of a neighboring Cry3Aa monomer (FIG. 25c). His 161 was mutated to Arg (3AH161R) to increase the pKa of the charged residue, thereby maintaining an electrostatic interaction with Glu 168 across a broader range of pH values. This mutation also reduced the solubility of the Cry3Aa crystal (FIG. 25b). More importantly, when a modified Cry3Aa protein containing both the S145C and H161R mutations was co-expressed with three copies of PML^(VG) (3ADM[PML^(VG)]3), the reusability of the catalyst was enhanced during the production of biodiesel from waste cooking oil (FIG. 25d), increasing the total turnover number (TTN) by 50% after 25 reaction cycles.

Example 6 Cry3Aa Crystal Entrapment for Protein Purification

It was previously demonstrated that lysozyme can bind to Cry3Aa crystals in high occupancy, but can be easily released at high salt concentration. It was later demonstrated that the highly positively charged nature of lysozyme was responsible for its binding to Cry3Aa, which has a negatively charged patch of amino acids within its crystal channels. Based on these data, it was speculated that one could fuse a positively charged patch of amino acids (poly-arginine) to a protein, coexpress the fusion protein with Cry3Aa in Bacillus thuringiensis, capture the fusion protein in vivo in the Cry3Aa crystal, and release the fusion protein in vitro by adding NaCl. If successful, this can offer a new approach to express and purify soluble protein without the need for columns or expensive reagents. To test this hypothesis, the coding sequence for a poly-arginine tail with 9 consecutive arginine residues (R9) was fused to the coding sequence for lipase A from Bacillus subtilis (lipA, 19 kDa) and inserted downstream of the Cry3Aa coding sequence to generate the construct 3A[lipAR9]. As expected, the purified Cry3Aa crystals contained lipAR9, as demonstrated by the high lipase activity of the resulting crystals (FIG. 26a). Notably, lipAR9 can be easily released from the crystal by addition of NaCl (FIG. 26b). These data show not only that R9 can be used as a fusion tag to encapsulate proteins and enzymes within Cry3Aa crystals, but also that the Cry3Aa platform could be implemented as a general strategy for protein purification.

All patents, patent applications, and other publications, including GenBank Accession Numbers, cited in this application are incorporated by reference in the entirety for all purposes.

1. A method for recombinantly expressing a protein, comprising: (1) obtaining bacterial cells comprising an expression cassette encoding the protein and an expression cassette encoding a Cry protein, a crystal-forming fragment thereof, or a fusion protein capable of forming crystals comprising the Cry protein or the crystal-forming fragment thereof; and (2) culturing the bacterial cells under conditions permissible for the expression of the protein and the Cry protein, the crystal-forming fragment thereof, or the fusion protein, wherein the Cry protein, the crystal-forming fragment thereof, or the fusion protein forms crystal containing the protein upon both being expressed in the bacterial cells.
2. The method of claim 1, wherein the protein is an enzyme.
3. The method of claim 2, wherein the enzyme is a lipase, ligase, hydrolase, esterase, protease, or glycosidase.
4. The method of claim 3, wherein the lipase is PML, PML^(VG), lipA, or lipAR9.
5. The method of claim 1, wherein the Cry protein is Cry3Aa.
6. The method of claim 1, wherein the fusion protein is Cry3Aa-[SmtA]₁.
7. The method of claim 1, wherein the expression cassette encoding the protein and the expression cassette encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein are one single expression cassette.
8. The method of claim 7, wherein the one single expression cassette comprises (1) one copy of polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein and (2) one copy or two or more copies of polynucleotide sequence encoding the protein.
9. The method of claim 8, wherein (1) the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein and (2) the polynucleotide sequence encoding the protein are operably linked to one single promoter.
10. The method of claim 9, wherein the one single promoter is operably linked to (1) one copy of the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein, followed by (2) one copy of the polynucleotide sequence encoding the protein, with one ribosome binding site between (1) and (2).
11. The method of claim 9, wherein the one single promoter is operably linked to (1) one copy of the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein, followed by (2) two or more copies of the polynucleotide sequence encoding the protein, with one ribosome binding site between (1) and (2) and between two copies of polynucleotide sequence encoding the protein.
12. The method of claim 8, wherein (1) the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein and (2) the polynucleotide sequence encoding the protein are operably linked to two separate promoters.
13. The method of claim 12, wherein the two separate promoters are two different promoters.
14. The method of claim 12, wherein (1) the polynucleotide sequence encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein and (2) the polynucleotide sequence encoding the protein share one single termination codon, resulting in one copy of the Cry protein, crystal-forming fragment thereof, or the fusion protein and two copies of the protein being expressed.
15. The method of claim 1, wherein the expression cassette encoding the protein and the expression cassette encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein are two separate expression cassettes.
16. The method of claim 1, wherein the fusion protein comprises the Cry protein or crystal-forming fragment thereof and one or more heterologous polypeptides at the N- and/or C-terminus.
17. The method of claim 16, wherein the one or more heterologous polypeptides are 1-3 repeats of SEQ ID NO:8.
18. The method of claim 1, wherein two or more proteins are expressed with the Cry protein, the crystal-forming fragment thereof, or the fusion protein and are contained within the crystal formed by the Cry protein, the crystal-forming fragment thereof, or the fusion protein.
19. The method of claim 1, wherein the bacterial cells are Bacillus subtilis (Bs), Bacillus thuringiensis (Bt), or E. coli cells.
20. The method of claim 1, further comprising, prior to step (1), introducing into the bacterial cells the expression cassette encoding the protein and the expression cassette encoding the Cry protein, crystal-forming fragment thereof, or the fusion protein.
21. The method of claim 1, further comprising, after step (2), isolating the crystal formed by the Cry protein, the crystal-forming fragment thereof, or the fusion protein and containing the protein.
22. The method of claim 21, further comprising, after the isolating step, washing the crystal under conditions permissible for releasing the protein from the crystal, thereby releasing from the crystal and isolating the protein.
23. The method of claim 21, wherein the protein is an enzyme.
24. A crystal produced by the method of claim 1.
25. The crystal of claim 24, wherein the protein is an enzyme.
26. A method for performing a reaction, comprising the step of incubating the crystal of claim 25 with a substrate to the enzyme under conditions permissible for the substrate to be catalyzed by the enzyme.
27. The method of claim 26, wherein the enzyme is a lipase.
28. The method of claim 27, wherein the lipase is PML, PML^(VG), lipA, or lipAR9.
29. The method of claim 26, further comprising, after the reaction is completed, removing the reaction product and reusing the crystal in a second reaction.
30. An in vitro method of cocrystallizing Cry3Aa with at least one protein of interest by mixing soluble Cry3Aa protein with the protein of interest, and allowing the protein of interest to form Cry3Aa crystals with the protein of interest entrapped.
31. A crystal produced by the method of claim 30, wherein the protein of interest is an enzyme.
32. A method for performing a reaction, comprising the step of incubating the crystal of claim 31 with a substrate to the enzyme under conditions permissible for the substrate to be catalyzed by the enzyme.
33. A method for delivering the crystal of claim 24 to cells.
34. The method of claim 33, wherein the cells are macrophages, lymphocytes, cancer cells, red blood cells, epithelial cells, stem cells, or liver cells.